Abstract. We use death processes and embeddings into continuous time in order to analyze several
urn models with a diminishing content. In particular we discuss generalizations of the pills problem,
originally introduced by Knuth and McCarthy, and generalizations of the well-known sampling without
replacement urn models, and OK Corral urn models.



1. Introduction 

1.1. Diminishing Pólya-Eggenberger urn models. In this work we are concerned with so-called
Pólya-Eggenberger urn models, which in the simplest case of two types of colors for the balls can
be described as follows. At the beginning, the urn contains $n$ white and $m$ black balls. At every
step, we choose a ball at random from the urn, examine its color and put it back into the urn and
then add/remove balls according to its color by the following rules. If the ball is white, then we
put $\alpha$ white and $\beta$ black balls into the urn, while if the ball is black, then $\gamma$ white balls and $\delta$ black
balls are put into the urn. The values $\alpha, \beta, \gamma, \delta \in \mathbb{Z}$ are fixed integer values and the urn model
is specified by the transition matrix $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$. Urn models with $r$ ($\ge 2$) types of colors can
be described in an analogous way and are specified by an $r \times r$ transition matrix. Urn models are
simple, useful mathematical tools for describing many evolutionary processes in diverse fields of
application such as analysis of algorithms and data structures, statistics and genetics. Due to their
importance in applications, there is a huge literature on the stochastic behavior of urn models; see for
example [10, 15, 18]. Recently, a few different approaches have been proposed, which yield deep and
far-reaching results [1, 3, 4, 8, 9, 19]. Most papers in the literature impose the so-called tenability
condition on the transition matrix, so that the process can be continued ad infinitum, or so that no balls of a
given color are ever completely removed. However, in some applications, examples of which are given below, there
are urn models of a very different nature, which we will refer to as "diminishing urn models." Such
models have recently received some attention; see for example [20, 5, 3, 21, 2, 22]. For simplicity of
presentation, we describe diminishing urn models in the case of balls with two types of colors, black
and white.

We consider Pólya-Eggenberger urn models specified by a transition matrix $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$, and in
addition we also specify a set of absorbing states $\mathcal{A} \subseteq \mathbb{N} \times \mathbb{N}$. The evolution of the urn takes place in
the state space $\mathcal{S} \subseteq \mathbb{N} \times \mathbb{N}$. The urn contains $m$ black balls and $n$ white balls at the beginning, with



Date: October 12, 2011. 

2000 Mathematics Subject Classification. 05A15,60F05,05C05. 

Key words and phrases. Urn models, Generating functions, Limiting distribution. 

The second author was supported by the Austrian Science Foundation FWF, grant S9608-NI3. 




M. KUBA AND A. PANHOLZER 



$(m, n) \in \mathcal{S}$, and evolves by successive draws at discrete instances according to the transition matrix
until an absorbing state in $\mathcal{A}$ is reached, and the process stops. Diminishing urn models with more
than two types of balls can be considered similarly.
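
The definition above can be turned into a small exact computation. The following Python sketch is only an illustration under stated assumptions: the function name `absorption_distribution`, the state order (white, black) and the use of a predicate for the absorbing set are choices made here, not notation from the text. It computes the exact law of the absorbing state by conditioning on the color of the drawn ball.

```python
from fractions import Fraction
from functools import lru_cache


def absorption_distribution(white, black, matrix, is_absorbing):
    """Exact law of the absorbing state reached from (white, black).

    matrix = ((a, b), (c, d)): drawing a white ball changes the urn content
    by (a, b), drawing a black ball by (c, d), in the order (white, black).
    is_absorbing(w, k) decides whether the state (w, k) is absorbing.
    Hypothetical helper; assumes the urn is diminishing, so the recursion ends.
    """
    (a, b), (c, d) = matrix

    @lru_cache(maxsize=None)
    def law(w, k):
        if is_absorbing(w, k):
            return {(w, k): Fraction(1)}
        result = {}
        total = w + k
        # A ball of each color is drawn with probability proportional to its count.
        for weight, nxt in ((w, (w + a, k + b)), (k, (w + c, k + d))):
            if weight == 0:
                continue
            for state, prob in law(*nxt).items():
                result[state] = result.get(state, Fraction(0)) + Fraction(weight, total) * prob
        return result

    return law(white, black)
```

For instance, with the matrix ((0, -1), (-1, 0)) and both coordinate axes absorbing, a start with 2 white and 1 black ball ends in (2, 0) with probability 2/3 and in (1, 0) or (0, 1) with probability 1/6 each.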

1.2. Plan of this note and notations. There are numerous examples of diminishing urns and related
problems in the literature. In the following we present three concrete problems, the pills problem, the
sampling without replacement urn, and the OK Corral urn model, and summarize known results.
For all three problems presented below, and suitable generalizations of them, we will use stochastic
processes and an embedding in continuous time, in order to unify and extend known results in the
literature concerning exact distribution laws, generalizing some results of [2, 6, 17, 21, 11, 12, 13, 20,
3]. We will denote with $X \oplus Y$ the sum of independent random variables $X$ and $Y$. Moreover, we
use the notations $\mathbb{N} = \{1, 2, 3, \dots\}$ and $\mathbb{N}_0 = \{0, 1, 2, \dots\}$.

1.3. The pills problem and generalizations. Consider the diminishing urn problem with transition
matrix given by $M = \begin{pmatrix} -1 & 0 \\ 1 & -1 \end{pmatrix}$, state space $\mathcal{S} = \mathbb{N} \times \mathbb{N}$, and the absorbing axis $\mathcal{A} = \{(0, n) \mid n \in \mathbb{N}\}$.
An interpretation is as follows. An urn has two types of pills in it, which are single-unit and double-unit
pills, respectively. At every step, we pick a pill uniformly at random. If a single-unit pill is
chosen, then we eat it up, and if the pill is of double unit, we break it into two halves: one half
is eaten up and the other half is now considered of single unit and thrown back into the urn. The
question is then, when starting with $n$ single-unit pills and $m$ double-unit pills, what is the probability
that $k$ single-unit pills remain in the urn when all double-unit pills are drawn? This problem has been
stated by Knuth and McCarthy in [14], where the authors asked for a formula for the expected number
of remaining single-unit pills, when there are no double-unit pills in the urn. The solution appeared
in [7]. A more refined study was given by Brennan and Prodinger [2], where they derive exact formulae
for the variance and the third moment of the number of remaining single-unit pills; furthermore, a
few generalizations are proposed. The probability generating functions and limit laws for the pills
problem and a variant of the problem have been derived in [6] using a generating functions approach.
Furthermore, a study of the arising limiting distributions of a general class of related problems has
been carried out in [17] using a recursive approach, basically guessing the structure of the moments,
together with an application of the so-called method of moments. However, some cases proved to
be quite elusive using the techniques of [17]. Moreover, no simple explicit general formula for the
probability mass function of the random variable of interest was obtained before. Furthermore, we will
discuss in this note weighted generalizations of the pills problem.

1.4. Sampling without replacement and generalizations. This classical example, often serving
as a toy model, corresponds to the urn with transition matrix $M = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$ with absorbing axis
$\mathcal{A} = \{(0, n) \mid n \in \mathbb{N}\}$. In this model, balls are drawn one after another from an urn containing
balls of two different colors and not replaced. What is the probability that $k$ white balls remain
when all black balls have been removed, starting with $n$ white and $m$ black balls? This simple urn model
has been discussed in [6] using generating functions. Moreover, generalizations of the sampling urn
model have been discussed in [17, 16].
1.5. The OK Corral urn model. The so-called OK Corral urn serves as a mathematical model of
the historical gun fight at the OK Corral. This problem was introduced by Williams and McIlroy
in [22] and studied recently by several authors using different approaches, leading to very deep and
interesting results; see [21, 11, 12, 13, 20, 3]. Also the urn corresponding to the OK Corral problem
can be viewed as a basic model in the mathematical theory of warfare and conflicts; see [12, 13].

In the diminishing urn setting the OK Corral problem corresponds to the urn with transition matrix
$M = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}$ with two absorbing axes: $\mathcal{A} = \{(0, n) \mid n \in \mathbb{N}\} \cup \{(m, 0) \mid m \in \mathbb{N}\}$. An
interpretation is as follows. Two groups of gunmen, group A and group B (with $n$ and $m$ gunmen,
respectively), face each other. At every discrete time step, one gunman is chosen uniformly at random
who then shoots and kills exactly one gunman of the other group. The gunfight ends when one group
gets completely "eliminated". Several questions are of interest: what is the probability that group
A (group B) survives, and what is the probability that the gunfight ends with $k$ survivors of group
A (group B)? This model was analyzed by Williams and McIlroy [22], who obtained an interesting
result for the expected value of the number of survivors. Using martingale arguments and the method
of moments, Kingman [11] gave limiting distribution results for the OK Corral urn model for the
total number of survivors. Moreover, Kingman [12] obtained further results in a very general setting
of Lanchester's theory of warfare. Kingman and Volkov [13] gave a more detailed analysis of the
balanced OK Corral urn model using a connection to the famous Friedman urn model; amongst
others, they derived an explicit result for the number of survivors and even local limit laws. In his
Ph.D. thesis [20] Puyhaubert extended the results of [11, 13] on the balanced OK Corral urn model
concerning the number of survivors of a certain group, using analytic combinatorial methods. His
study is based on the connection to the Friedman urn shown in [13]. He obtained explicit expressions
for the probability distribution, the moments, and also reobtained (and refined) most of the limiting
distribution results reported earlier. Some results of [20] were reported in the work of Flajolet et
al. [3]. Apparently unknown to the previously stated authors was the earlier work of Stadje [21], who
obtained several limiting distribution results for the generalized OK Corral urn, as introduced below,
and also for related urn models with more general transition probabilities. In [21] the probability
distributions for the most general transition probabilities are ingeniously determined by a complex
integral. The results of Stadje were then discussed in [16], and their connection to sampling without
replacement, a duality relation, was uncovered. However, no transparent probabilistic derivation of the
results of [16] was given before.



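2. Probabilistic analysis of the pills problem urn models

We are interested in a generalized pills problem with ball replacement matrix given by

\[
M = \begin{pmatrix} -\alpha & 0 \\ \gamma & -\delta \end{pmatrix}, \tag{1}
\]

where $\alpha, \delta \in \mathbb{N}$, and $\gamma = \alpha \cdot p$, $p \in \mathbb{N}_0$. Let $X_{n,m}$ denote the random variable counting the number of
remaining white balls (divided by $\alpha$) when all black balls have been drawn. The probability generating
function $h_{n,m}(v) = \mathbb{E}(v^{X_{n,m}})$ satisfies the recurrence relation

\[
h_{n,m}(v) = \frac{\alpha n}{\alpha n + \delta m}\, h_{n-1,m}(v) + \frac{\delta m}{\alpha n + \delta m}\, h_{n+p,m-1}(v), \tag{2}
\]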






with $h_{n,0}(v) = v^n$, $n \ge 1$. We analyze $X_{n,m}$ using a continuous time embedding. We start at time
zero with $n$ white balls and $m$ black balls, and use two independent linear processes. The first one
consists of $n$ independent ordinary death processes (white balls) with death rate $\alpha$. Let $W_n(t)$ denote
the random variable counting the number of living white balls at time $t$, with $W_n(0) = n$. The second
one (black balls) consists of $m$ independent modified death processes with rate $\delta$, where each black
ball at its death gives birth to $p$ new white balls with death rate $\alpha$, independent of all other balls, and
$p \in \mathbb{N}_0$. We denote with $B_m(t)$ the random variable counting the number of living black balls at time
$t$, with $B_m(0) = m$. Finally, let $C_m(t)$ denote the random variable counting the number of surviving
white balls up to time $t$, which are children of black balls. Let $\tau = \inf\{t > 0 : B_m(t) = 0\}$ be the time when
the black balls die out. Then

\[
X_{n,m} \overset{(d)}{=} W_n(\tau) + C_m(\tau),
\]

where, conditionally on $\tau$, both random variables are independent, due to the construction. One readily obtains the recurrence
relation (2) for the probability generating function $h_{n,m}(v)$ by looking at the time when the first particle
dies. It is well known that the probability that $X_i$ attains the minimum among $r$ independent
exponentially distributed random variables $X_1, \dots, X_r$ with parameters $\lambda_1, \dots, \lambda_r$ is given by

\[
\mathbb{P}\{X_i = \min\{X_1, \dots, X_r\}\} = \int_0^\infty \lambda_i e^{-\lambda_i t} \prod_{j \ne i} e^{-\lambda_j t}\, dt = \frac{\lambda_i}{\lambda_1 + \dots + \lambda_r}.
\]

Hence, we easily obtain that the probability that any of the type one (white) balls dies first is given by $\frac{\alpha n}{\alpha n + \delta m}$,
and the opposite case happens with probability $\frac{\delta m}{\alpha n + \delta m}$. Moreover, if $p$ new white balls with death rate
$\alpha$ are being born, they can be grouped with the already existing white balls, due to the memorylessness
of exponential distributions. This leads directly to (2).
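
The embedding can be illustrated by a small Monte Carlo experiment. The sketch below is not part of the original argument; the function names `simulate_embedding` and `estimate_mean`, the sample size and the seed are assumptions made for illustration. Each white ball receives an exponential lifetime with rate $\alpha$, each black ball an exponential lifetime with rate $\delta$, every black ball spawns $p$ white children at its death, and the white balls alive at the death of the last black ball are counted.

```python
import random


def simulate_embedding(n, m, alpha, delta, p, rng):
    """One sample of X_{n,m} from the continuous-time embedding (requires m >= 1)."""
    black_deaths = [rng.expovariate(delta) for _ in range(m)]
    tau = max(black_deaths)  # time at which the last black ball dies
    # Original white balls that are still alive at time tau.
    alive = sum(1 for _ in range(n) if rng.expovariate(alpha) > tau)
    # Each black ball gives birth to p white balls at its death time d.
    for d in black_deaths:
        for _ in range(p):
            lifetime = rng.expovariate(alpha)
            # Children of the last black ball are born at tau and always counted.
            if d == tau or d + lifetime > tau:
                alive += 1
    return alive


def estimate_mean(n, m, alpha, delta, p, samples=20000, seed=0):
    """Monte Carlo estimate of E(X_{n,m}); sample size and seed are arbitrary."""
    rng = random.Random(seed)
    total = sum(simulate_embedding(n, m, alpha, delta, p, rng) for _ in range(samples))
    return total / samples
```

For $\alpha = \delta = p = 1$ and $n = m = 1$, recurrence (2) gives $h_{1,1}(v) = (v + v^2)/2$, so the estimate should be close to $3/2$.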

Due to the construction of the two processes the random variables $W_n(t)$ and $C_m(t)$ can be
decomposed themselves into sums of i.i.d. random variables,

\[
W_n(t) = \sum_{k=1}^{n} X_k(t), \qquad C_m(t) = \sum_{k=1}^{m} Y_k(t),
\]

where $X_k(t)$ denotes the indicator variable of the $k$-th white ball living at time $t$, $1 \le k \le n$, and
$Y_k(t)$ denotes the random variable counting the number of surviving white balls up to time $t$, which
are children of the $k$-th black ball, $1 \le k \le m$.

The probability generating function of a single white ball at time $t$ with rate $\alpha$ is given by

\[
\mathbb{E}(v^{X_k(t)}) = \mathbb{P}\{X_k(t) = 0\} + v\, \mathbb{P}\{X_k(t) = 1\} = 1 - e^{-\alpha t} + v e^{-\alpha t},
\]

so the probability generating function of the total number of white balls living at time $t$ is given by

\[
\mathbb{E}(v^{W_n(t)}) = \prod_{k=1}^{n} \mathbb{E}(v^{X_k(t)}) = \left(1 + (v - 1)e^{-\alpha t}\right)^n,
\]
due to the independence assumption. The probability generating function of $Y_k(t)$, assuming that the
black ball dies before time $t$, is given by

\[
\int_0^t \delta e^{-\delta u} \left(1 - e^{-\alpha(t-u)} + v e^{-\alpha(t-u)}\right)^p du = \int_0^t \delta e^{-\delta u} \left(1 + (v - 1)e^{-\alpha(t-u)}\right)^p du,
\]

due to the fact that $p$ independent white balls are being born at the death of the black ball.
Consequently, the probability generating function of the children of the $m - 1$ black balls, dying before
time $t$, and the corresponding number of surviving child balls up to time $t$ is given by

\[
\left( \int_0^t \delta e^{-\delta u} \left(1 + (v - 1)e^{-\alpha(t-u)}\right)^p du \right)^{m-1}.
\]

Furthermore, the density of the death time of the last remaining black ball is given by $\delta e^{-\delta t}$, and this ball gives birth to $p$ more
surviving white balls. Moreover, this final ball can be any one out of the $m$ balls. Altogether, considering
all possible final death times $t$, or more precisely by conditioning on the stopping time $\tau$, we
obtain for the probability generating function $h_{n,m}(v) = \mathbb{E}(v^{X_{n,m}})$ the following result.

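Theorem 1. For arbitrary $\alpha, \delta \in \mathbb{N}$ and $p \in \mathbb{N}_0$, the probability generating function of the random
variable $X_{n,m}$ counting the number of remaining white balls (divided by $\alpha$) when all black balls have
been drawn, with $M = \begin{pmatrix} -\alpha & 0 \\ \gamma & -\delta \end{pmatrix}$ and $\gamma = \alpha \cdot p$, is given by

\[
h_{n,m}(v) = \int_0^\infty \left(1 + (v - 1)e^{-\alpha t}\right)^n \cdot \left( \int_0^t \delta e^{-\delta u} \left(1 + (v - 1)e^{-\alpha(t-u)}\right)^p du \right)^{m-1} \cdot v^p\, m\, \delta e^{-\delta t}\, dt.
\]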
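
As a plausibility check, the integral representation can be compared numerically with recurrence (2). The following sketch rests on assumptions that are not in the text: the function names `h_recurrence` and `h_integral` are hypothetical, the substitution $q = e^{-\delta t}$ and Simpson's rule are one convenient way to evaluate the integral, and the inner integral is evaluated in closed form by the binomial theorem, which requires $\alpha\ell \neq \delta$ for $1 \le \ell \le p$.

```python
from functools import lru_cache
from math import comb


def h_recurrence(n, m, v, alpha, delta, p):
    """Evaluate h_{n,m}(v) numerically from recurrence (2), with h_{n,0}(v) = v^n."""
    @lru_cache(maxsize=None)
    def h(i, j):
        if j == 0:
            return v ** i
        total = alpha * i + delta * j
        value = delta * j / total * h(i + p, j - 1)
        if i > 0:
            value += alpha * i / total * h(i - 1, j)
        return value

    return h(n, m)


def h_integral(n, m, v, alpha, delta, p, steps=2000):
    """Evaluate the integral representation by Simpson's rule (steps must be even).

    After the substitution q = exp(-delta * t) the integral runs over [0, 1]
    and the factor delta * exp(-delta * t) dt becomes dq.
    """
    r = alpha / delta  # exp(-alpha * t) equals q ** r

    def inner(q):
        # Closed form of the inner u-integral; assumes alpha * l != delta.
        return sum(
            comb(p, l) * (v - 1) ** l * delta * (q - q ** (r * l)) / (alpha * l - delta)
            for l in range(p + 1)
        )

    def g(q):
        return (1 + (v - 1) * q ** r) ** n * inner(q) ** (m - 1) * v ** p * m

    width = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * width)
    return total * width / 3
```

For example, with $\alpha = 2$, $\delta = 1$, $p = 1$, $n = 0$, $m = 2$ both evaluations agree with $h_{0,2}(v) = \tfrac{2}{3}v + \tfrac{1}{3}v^2$.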

The results above unify and extend the known results of [6]. Moreover, they allow us to largely extend
the results of [17] concerning the structure of the moments, as stated below, and also to give a
complete analysis of the limit laws. Note that by setting $p = 0$ one also gets the probability generating
function for a certain generalized sampling without replacement urn model. From the result above we
will derive a closed formula for the $s$-th factorial moment $\mathbb{E}(\hat{X}_{n,m}^{\underline{s}})$ of $\hat{X}_{n,m} = X_{n,m} - p$, for $\alpha \neq \delta$
such that $\alpha\ell - \delta \neq 0$, $1 \le \ell \le p$; the special case $\alpha = \delta$ has already been treated in [6]. Note that the factorial
moments $\mathbb{E}(X_{n,m}^{\underline{s}})$ of $X_{n,m}$ are recovered using the binomial theorem for the falling factorials,

\[
\mathbb{E}(X_{n,m}^{\underline{s}}) = \mathbb{E}\big((\hat{X}_{n,m} + p)^{\underline{s}}\big) = \sum_{i=0}^{s} \binom{s}{i}\, \mathbb{E}(\hat{X}_{n,m}^{\underline{i}})\, p^{\underline{s-i}}.
\]



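Theorem 2. The factorial moments of the random variable $\hat{X}_{n,m} = X_{n,m} - p$ are given in terms of
a generalized beta integral,

\[
\mathbb{E}(\hat{X}_{n,m}^{\underline{s}}) = s!\, m\, \delta \sum_{\substack{i, k_0, \dots, k_p \ge 0 \\ k_0 + \dots + k_p = m-1 \\ i + \sum_{\ell=0}^{p} \ell k_\ell = s}} \binom{n}{i} \binom{m-1}{k_0, \dots, k_p} \prod_{\ell=0}^{p} \left( \binom{p}{\ell} \frac{\delta}{\alpha\ell - \delta} \right)^{k_\ell} \int_0^1 q^{\alpha i + \delta - 1} \prod_{\ell=0}^{p} \left(q^{\delta} - q^{\alpha\ell}\right)^{k_\ell} dq.
\]

In particular, for $p = 1$ we obtain the simple expression

\[
\mathbb{E}(\hat{X}_{n,m}^{\underline{s}}) = s!\, m\, \delta \sum_{i=0}^{s} \binom{n}{i} \binom{m-1}{s-i} \left( \frac{\delta}{\alpha - \delta} \right)^{s-i} \int_0^1 q^{\alpha i + \delta - 1} \left(1 - q^{\delta}\right)^{m-1-s+i} \left(q^{\delta} - q^{\alpha}\right)^{s-i} dq.
\]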

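Our starting point is the following expression for $\mathbb{E}(\hat{X}_{n,m}^{\underline{s}})$,

\[
\mathbb{E}(\hat{X}_{n,m}^{\underline{s}}) = E_v D_v^s\, \frac{h_{n,m}(v)}{v^p},
\]

where $E_v$ denotes the operator which evaluates at $v = 1$, and $D_v$ the differentiation operator with respect to $v$. By the
binomial theorem we have

\[
\int_0^t \delta e^{-\delta u} \left(1 + (v - 1)e^{-\alpha(t-u)}\right)^p du = \sum_{\ell=0}^{p} \binom{p}{\ell} (v - 1)^{\ell}\, \delta\, \frac{e^{-t\delta} - e^{-\alpha\ell t}}{\ell\alpha - \delta}.
\]

Consequently, using the multinomial theorem, we obtain

\[
\left( \int_0^t \delta e^{-\delta u} \left(1 + (v - 1)e^{-\alpha(t-u)}\right)^p du \right)^{m-1} = \sum_{k_0 + \dots + k_p = m-1} \binom{m-1}{k_0, \dots, k_p} \prod_{\ell=0}^{p} \left( \binom{p}{\ell}\, \delta\, (v - 1)^{\ell}\, \frac{e^{-t\delta} - e^{-\alpha\ell t}}{\ell\alpha - \delta} \right)^{k_\ell}.
\]

Using

\[
\left(1 + (v - 1)e^{-\alpha t}\right)^n e^{-\delta t} = \sum_{i=0}^{n} \binom{n}{i} (v - 1)^i e^{-(\alpha i + \delta)t}
\]

we get

\[
\frac{h_{n,m}(v)}{v^p} = m\, \delta \sum_{i=0}^{n} \sum_{k_0 + \dots + k_p = m-1} \binom{n}{i} \binom{m-1}{k_0, \dots, k_p} (v - 1)^{i + \sum_{\ell=0}^{p} \ell k_\ell} \prod_{\ell=0}^{p} \left( \binom{p}{\ell} \frac{\delta}{\ell\alpha - \delta} \right)^{k_\ell} \int_0^\infty e^{-(\alpha i + \delta)t} \prod_{\ell=0}^{p} \left(e^{-t\delta} - e^{-\alpha\ell t}\right)^{k_\ell} dt.
\]

Since $E_v D_v^s (v - 1)^j = s!$ for $j = s$ and $E_v D_v^s (v - 1)^j = 0$ otherwise, we get the simpler expression

\[
\mathbb{E}(\hat{X}_{n,m}^{\underline{s}}) = s!\, m\, \delta \sum_{\substack{i, k_\ell \ge 0 \\ k_0 + \dots + k_p = m-1 \\ i + \sum_{\ell=0}^{p} \ell k_\ell = s}} \binom{n}{i} \binom{m-1}{k_0, \dots, k_p} \prod_{\ell=0}^{p} \left( \binom{p}{\ell} \frac{\delta}{\ell\alpha - \delta} \right)^{k_\ell}
\times \int_0^\infty e^{-(\alpha i + \delta)t} \prod_{\ell=0}^{p} \left(e^{-t\delta} - e^{-\alpha\ell t}\right)^{k_\ell} dt.
\]

Now we use the substitution $q = e^{-t}$ in order to convert the integral above into a beta-function type
integral, which proves our result.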



ON DEATH PROCESSES AND URN MODELS 






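2.1. Higher dimensional urn models. One can readily extend the $2 \times 2$ transition matrix (1) to
higher dimensions,

\[
M = \begin{pmatrix}
-\alpha_1 & & & & \\
p_2\alpha_1 & -\alpha_2 & & & \\
& p_3\alpha_2 & -\alpha_3 & & \\
& & \ddots & \ddots & \\
& & & p_r\alpha_{r-1} & -\alpha_r
\end{pmatrix},
\]

with $\alpha_j \in \mathbb{N}$ and $p_j \in \mathbb{N}_0$. We consider the distribution of the random vector $\mathbf{X}_{\mathbf{n}} = (X_{\mathbf{n},1}, \dots, X_{\mathbf{n},r-1})$,
which counts the number of type 1 up to type $r - 1$ pills when all pills of $r$ units are taken, starting
with $n_i$ pills of $i$ units, $i = 1, \dots, r$. One may use similar arguments to the $2 \times 2$ case to obtain the
following result.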

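Theorem 3. The probability generating function of $\mathbf{X}_{\mathbf{n}}$ is given by

\[
h_{\mathbf{n}}(\mathbf{v}) = \int_0^\infty n_r \alpha_r e^{-\alpha_r t} \left(g_r(t, \mathbf{v})\right)^{n_r - 1} \prod_{j=1}^{r-1} \left(f_j(t, \mathbf{v})\right)^{n_j} \cdot v_{r-1}^{p_r}\, dt.
\]

Here $f_j(t, \mathbf{v})$ denotes a sequence of functions defined by $f_0(t, \mathbf{v}) = 1$, and

\[
f_j(t, \mathbf{v}) = v_j e^{-\alpha_j t} + \int_0^t \alpha_j e^{-\alpha_j u_j} \left(f_{j-1}(t - u_j, \mathbf{v})\right)^{p_j} du_j, \qquad j \ge 1,
\]

with $g_j(t, \mathbf{v}) = f_j(t, \mathbf{v}) - v_j e^{-\alpha_j t}$.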

2.2. General weight sequences. One may also obtain the result for $X_{n,m}$ using a slightly different
model. Our first process still consists of $n$ independent ordinary death processes (white balls)
with death rate $\alpha$. However, concerning the second process, we consider a single modified death
process $B(t)$ with death rates $\theta_m, \dots, \theta_1$, starting with $B(0) = m$. At each transition of $B(t)$ exactly
$p$ white balls are being born, modelled by $p$ independent ordinary death processes (white balls) with
death rate $\alpha$. Consequently, one obtains the alternative description

\[
h_{n,m}(v) = \int_0^\infty \left(1 + (v - 1)e^{-\alpha t}\right)^n \cdot p_m(t, v)\, dt,
\]

where $p_m(t, v)$ denotes the density of the time $t$ at which $B(t)$ dies out, with the variable $v$ marking the living
white balls at time $t$,

\[
p_m(t, v) = \int_0^t \int_0^{t - u_m} \cdots \int_0^{t - u_m - \dots - u_3} \prod_{j=2}^{m} \theta_j e^{-\theta_j u_j} \left(1 + (v - 1)e^{-\alpha(t - u_m - \dots - u_j)}\right)^p \cdot \theta_1 e^{-\theta_1 (t - u_m - \dots - u_2)}\, v^p\, du_2 \cdots du_m. \tag{3}
\]

Note that for $\theta_k = \delta \cdot k$, one reobtains the earlier result.
3. Probabilistic analysis of sampling without replacement and OK Corral type urn models

We will generalize the sampling without replacement urns and the OK Corral urn models by analyzing
two urn models associated to sequences of positive real numbers $A = (\alpha_n)_{n \in \mathbb{N}}$ and $B = (\beta_m)_{m \in \mathbb{N}}$.
The dynamics of the discrete time process of drawing and replacing balls is as follows: at every
discrete time step, we draw a ball from the urn according to the number of white and black balls
present in the urn, with respect to the sequences $(A, B)$, subject to the two models defined below. The
chosen ball is discarded and the sampling procedure continues until one type of balls is completely
drawn.

Urn model I (Sampling without replacement with general weights). Assume that $n$ white and $m$ black
balls are contained in the urn, with arbitrary $n, m \in \mathbb{N}$. A white ball is drawn with probability
$\alpha_n / (\alpha_n + \beta_m)$, and a black ball is drawn with probability $\beta_m / (\alpha_n + \beta_m)$. Additionally, we assume
for urn model I that $\alpha_0 = \beta_0 = 0$.

Urn model II (OK Corral urn model with general weights). For arbitrary $n, m \in \mathbb{N}$ assume that $n$
white and $m$ black balls are contained in the urn. A white ball is drawn with probability $\beta_m / (\alpha_n + \beta_m)$,
and a black ball is drawn with probability $\alpha_n / (\alpha_n + \beta_m)$.

The absorbing states, i.e. the points where the evolution of the urn models stops, are given for both urn
models by the positive lattice points on the coordinate axes, $\{(0, n) \mid n \ge 1\} \cup \{(m, 0) \mid m \ge 1\}$.
These two urn models generalize two famous Pólya-Eggenberger urn models with two types of balls,
namely the classical sampling without replacement (I), and the so-called OK Corral urn model (II),
described in detail below. We are interested in a probabilistic derivation of the distribution of the
random variable $X_{n,m}$, counting the number of white balls when all black balls have been drawn. In
order to simplify the analysis we note that, by the following duality, there is essentially only a single urn model to study.

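Lemma 1 ([16]). Let $\mathbb{P}\{X_{n,m,[A,B,I]} = k\}$ denote the probability that $k$ white balls remain when all
black balls have been drawn in urn model I with weight sequences $A = (\alpha_n)_{n \in \mathbb{N}}$, $B = (\beta_m)_{m \in \mathbb{N}}$,
and $\mathbb{P}\{X_{n,m,[\tilde{A},\tilde{B},II]} = k\}$ the corresponding probability in urn model II with weight sequences
$\tilde{A} = (\tilde{\alpha}_n)_{n \in \mathbb{N}}$, $\tilde{B} = (\tilde{\beta}_m)_{m \in \mathbb{N}}$. The probabilities $\mathbb{P}\{X_{n,m,[A,B,I]} = k\}$ and $\mathbb{P}\{X_{n,m,[\tilde{A},\tilde{B},II]} = k\}$
are dual to each other, i.e. they are related in the following way:

\[
\mathbb{P}\{X_{n,m,[A,B,I]} = k\} = \mathbb{P}\{X_{n,m,[\tilde{A},\tilde{B},II]} = k\},
\]

for $\tilde{\alpha}_n = 1/\alpha_n$, $\tilde{\beta}_m = 1/\beta_m$, $n, m \in \mathbb{N}$, and $k \ge 0$.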

Without loss of generality, we will restrict ourselves to the urn model I. Note that the recurrence
relation for the probability generating function $h_{n,m}(v) = \mathbb{E}(v^{X_{n,m}})$ of $X_{n,m}$ is given by

\[
h_{n,m}(v) = \frac{\alpha_n}{\alpha_n + \beta_m}\, h_{n-1,m}(v) + \frac{\beta_m}{\alpha_n + \beta_m}\, h_{n,m-1}(v), \qquad n, m \ge 1, \tag{4}
\]

with initial values $h_{n,0}(v) = v^n$, $h_{0,m}(v) = 1$, $n, m \ge 0$.
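
Recurrence (4) and the duality between the two urn models can be checked with exact rational arithmetic. The sketch below is an illustration only: the function name `pmf_model`, the representation of the weight sequences as Python callables and the dictionary output are assumptions made here. For urn model II the two drawing probabilities are swapped, and passing reciprocal weights to model II reproduces model I.

```python
from fractions import Fraction
from functools import lru_cache


def pmf_model(n, m, alpha, beta, model="I"):
    """Exact law of the number of white balls left when the black balls are gone.

    alpha and beta are callables mapping an index to a positive weight.
    Returns a dict {k: probability}. Hypothetical helper for illustration.
    """
    @lru_cache(maxsize=None)
    def law(i, j):
        if j == 0:
            return {i: Fraction(1)}  # all black balls drawn, i white balls remain
        if i == 0:
            return {0: Fraction(1)}  # the white balls were exhausted first
        a, b = Fraction(alpha(i)), Fraction(beta(j))
        # Model I draws white with probability a/(a+b), model II with b/(a+b).
        p_white = a / (a + b) if model == "I" else b / (a + b)
        out = {}
        for prob, nxt in ((p_white, (i - 1, j)), (1 - p_white, (i, j - 1))):
            for k, q in law(*nxt).items():
                out[k] = out.get(k, Fraction(0)) + prob * q
        return out

    return law(n, m)
```

With the weights $\alpha_n = n$, $\beta_m = m$ model I is the classical sampling without replacement urn; for $n = m = 1$ the sketch returns probability $1/2$ for both $k = 0$ and $k = 1$.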
3.1. Probabilistic embedding. We use a probabilistic approach, embedding the discrete-time model
into a continuous-time model. The basic idea is as follows. We consider two independent death
processes $X(t)$ and $Y(t)$, which stop at 0. Their death rates are defined using the weight sequences
$A = (\alpha_n)_{n \in \mathbb{N}}$, $B = (\beta_m)_{m \in \mathbb{N}}$: the death rates of $X(t)$, starting with $X(0) = n$, are $\alpha_n, \dots, \alpha_1$, and
the death rates of $Y(t)$, starting with $Y(0) = m$, are $\beta_m, \dots, \beta_1$. For the sake of convenience we set
$\beta_0 = 0$. We can model the random variable $X_{n,m}$ of urn model I by looking at the distribution of
$C_{n,m} = X(\tau)$, starting with $X(0) = n$, where $\tau$ denotes the time of the process $Y(t)$ dying out,
$\tau = \inf\{t > 0 : Y(t) = 0\}$. By conditioning on the first transition of the two processes one directly
obtains the recurrence relation (4) for $\mathbb{E}(v^{C_{n,m}})$, which proves that $X_{n,m}$ and $C_{n,m}$ have the same
distribution. Now things are simple. The probability that $X(t) = k$ is, according to the
definition, given by the iterated integral

\[
\mathbb{P}\{X(t) = k\} = \int_0^t \alpha_n e^{-\alpha_n u_n} \int_0^{t - u_n} \alpha_{n-1} e^{-\alpha_{n-1} u_{n-1}} \cdots \int_0^{t - u_n - \dots - u_{k+2}} \alpha_{k+1} e^{-\alpha_{k+1} u_{k+1}}\, e^{-\alpha_k (t - u_n - \dots - u_{k+1})}\, du_{k+1} \cdots du_n.
\]

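This integral can be evaluated,

\[
\mathbb{P}\{X(t) = k\} = \Big( \prod_{h=k+1}^{n} \alpha_h \Big) \sum_{\ell=k}^{n} \frac{e^{-\alpha_\ell t}}{\prod_{j=k,\, j \ne \ell}^{n} (\alpha_j - \alpha_\ell)}, \tag{5}
\]

which can easily be checked by induction. This result is covered in standard textbooks or lecture
notes; its derivation is usually based on the Kolmogorov equation and an application of the Laplace
transform. The exact distribution of $\tau$ is given by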

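\[
\mathbb{P}\{\tau \le t\} = \int_0^t \beta_m e^{-\beta_m u_m} \int_0^{t - u_m} \beta_{m-1} e^{-\beta_{m-1} u_{m-1}} \cdots \int_0^{t - u_m - \dots - u_2} \beta_1 e^{-\beta_1 u_1}\, du_1 \cdots du_m.
\]

One obtains the closed formula

\[
\mathbb{P}\{\tau \le t\} = \Big( \prod_{h=1}^{m} \beta_h \Big) \sum_{i=0}^{m} \frac{e^{-\beta_i t}}{\prod_{j=0,\, j \ne i}^{m} (\beta_j - \beta_i)},
\]

using the convention $\beta_0 = 0$. Hence, the density function of the stopping time $\tau$ is given by

\[
\frac{d}{dt}\, \mathbb{P}\{\tau \le t\} = \Big( \prod_{h=1}^{m} \beta_h \Big) \sum_{i=1}^{m} \frac{e^{-\beta_i t}}{\prod_{j=1,\, j \ne i}^{m} (\beta_j - \beta_i)}. \tag{6}
\]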
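
The closed formulas for the distribution function and the density of $\tau$ are easy to evaluate numerically. The following sketch uses the hypothetical function names `tau_cdf` and `tau_density` and assumes pairwise distinct positive rates $\beta_1, \dots, \beta_m$, passed as a list; it is meant only as a sanity check of the two formulas against each other.

```python
from math import exp, prod


def tau_cdf(t, betas):
    """P{tau <= t} for death rates betas = [beta_1, ..., beta_m] (pairwise distinct)."""
    rates = [0.0] + list(betas)  # convention beta_0 = 0
    scale = prod(betas)
    return scale * sum(
        exp(-bi * t) / prod(bj - bi for j, bj in enumerate(rates) if j != i)
        for i, bi in enumerate(rates)
    )


def tau_density(t, betas):
    """Density of tau at t, i.e. the derivative of tau_cdf with respect to t."""
    scale = prod(betas)
    return scale * sum(
        exp(-bi * t) / prod(bj - bi for j, bj in enumerate(betas) if j != i)
        for i, bi in enumerate(betas)
    )
```

For a single rate $\beta_1$ the two functions reduce to $1 - e^{-\beta_1 t}$ and $\beta_1 e^{-\beta_1 t}$, and a central difference quotient of `tau_cdf` should be close to `tau_density`.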
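Considering all possible times when the second process dies out leads to the integral representation

\[
\mathbb{P}\{X_{n,m} = k\} = \int_0^\infty \Big( \frac{d}{dt}\, \mathbb{P}\{\tau \le t\} \Big)\, \mathbb{P}\{X(t) = k\}\, dt
= \Big( \prod_{h=1}^{m} \beta_h \Big) \Big( \prod_{h=k+1}^{n} \alpha_h \Big) \sum_{\ell=k}^{n} \sum_{i=1}^{m} \frac{1}{(\alpha_\ell + \beta_i) \prod_{j=k,\, j \ne \ell}^{n} (\alpha_j - \alpha_\ell) \prod_{j=1,\, j \ne i}^{m} (\beta_j - \beta_i)}.
\]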

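The result above can be simplified in two different ways using the partial fraction identities

\[
\frac{1}{\prod_{\ell=k}^{n} (x + \alpha_\ell)} = \sum_{\ell=k}^{n} \frac{1}{(x + \alpha_\ell) \prod_{j=k,\, j \ne \ell}^{n} (\alpha_j - \alpha_\ell)}, \tag{7}
\]

\[
\frac{1}{\prod_{i=1}^{m} (x + \beta_i)} = \sum_{i=1}^{m} \frac{1}{(x + \beta_i) \prod_{j=1,\, j \ne i}^{m} (\beta_j - \beta_i)}. \tag{8}
\]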

Consequently, we have obtained a transparent probabilistic proof of the following result.

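Theorem 4 ([16]). The probability mass function of the random variable $X_{n,m}$, counting the number
of remaining white balls when all black balls have been drawn in urn model I with weight sequences
$A = (\alpha_n)_{n \in \mathbb{N}}$, $B = (\beta_m)_{m \in \mathbb{N}}$, is for $n, m \ge 1$ and $n \ge k \ge 1$ given by the explicit formula

\[
\mathbb{P}\{X_{n,m} = k\} = \Big( \prod_{h=1}^{m} \beta_h \Big) \Big( \prod_{h=k+1}^{n} \alpha_h \Big) \sum_{i=1}^{m} \frac{1}{\Big( \prod_{j=1,\, j \ne i}^{m} (\beta_j - \beta_i) \Big) \Big( \prod_{\ell=k}^{n} (\alpha_\ell + \beta_i) \Big)}
\]

\[
\phantom{\mathbb{P}\{X_{n,m} = k\}} = \Big( \prod_{h=1}^{m} \beta_h \Big) \Big( \prod_{h=k+1}^{n} \alpha_h \Big) \sum_{\ell=k}^{n} \frac{1}{\Big( \prod_{j=k,\, j \ne \ell}^{n} (\alpha_j - \alpha_\ell) \Big) \Big( \prod_{i=1}^{m} (\beta_i + \alpha_\ell) \Big)},
\]

assuming that $\alpha_j \ne \alpha_i$ and $\beta_j \ne \beta_i$, $1 \le j < i < \infty$, and that $\alpha_0 = 0$.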

It can be shown that the result above is also valid for $k = 0$. Moreover, by the duality of the two
urn models, one also gets the corresponding result for the urn model II, OK Corral type urn models,
by switching to weight sequences $A = (1/\alpha_n)_{n \in \mathbb{N}}$, $B = (1/\beta_m)_{m \in \mathbb{N}}$.

3.2. Sums of independent exponential random variables. Of course, the formulas (5), (6) stated
before do not come as a surprise, since one can take yet another viewpoint. The time $\tau$ until the
second process $Y(t)$ dies out has the same distribution as the sum of $m$ independent exponentially
distributed random variables $\epsilon_{\beta_i}$ with parameters $\beta_m, \dots, \beta_1$ stemming from the death rates of the
process. Hence,

\[
\tau \overset{(d)}{=} \bigoplus_{i=1}^{m} \epsilon_{\beta_i},
\]
where $\epsilon_{\beta_i}$ denotes an exponentially distributed random variable with parameter $\beta_i$, and the density is simply the formula
stated in (6). Furthermore, the distribution of $X(t)$ can also be modelled by independent exponentially distributed random
variables: let

\[
\theta = \bigoplus_{i=k+1}^{n} \epsilon_{\alpha_i}.
\]

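If $X(t) = k$, then the $n - k$ transitions of the process $X$ from $n$ to $k$ have occurred before $t$, and no further transition
afterwards. Hence,

\[
\mathbb{P}\{X(t) = k\} = \mathbb{P}\{\theta < t,\ \theta + \epsilon_{\alpha_k} > t\} = \Big( \prod_{h=k+1}^{n} \alpha_h \Big) \sum_{h=k+1}^{n} \int_0^t \frac{e^{-\alpha_h u}}{\prod_{j=k+1,\, j \ne h}^{n} (\alpha_j - \alpha_h)}\, e^{-\alpha_k (t - u)}\, du
\]

\[
= \sum_{h=k+1}^{n} \frac{\big( \prod_{i=k+1}^{n} \alpha_i \big) e^{-\alpha_h t}}{\prod_{j=k,\, j \ne h}^{n} (\alpha_j - \alpha_h)} + \sum_{h=k+1}^{n} \frac{\big( \prod_{i=k+1}^{n} \alpha_i \big) e^{-\alpha_k t}}{(\alpha_h - \alpha_k) \prod_{i=k+1,\, i \ne h}^{n} (\alpha_i - \alpha_h)},
\]

which simplifies to (5) after an application of (8).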